Update README.md to include workflow diagram #1

carlhiggs · 2024-04-16T03:58:32Z

Proposed inclusion of a mermaid workflow diagram in the readme markdown to illustrate the order in which files should be run. This could be expanded to add data outputs, potentially.

This looks a bit like this (the diagram is the new bit):

Proposed inclusion of a mermaid workflow diagram in the readme markdown to illustrate the order in which files should be run. This could be expanded to add data outputs, potentially.

StevePem · 2024-04-17T00:23:37Z

Hi Carl - looks great, but needs to change in one respectplease. intervention cycling speed.R doesn't depend on baseline, so should be in a row on its own. Can we have two separate rows:
(1) baseline.R -> intervention destinations.R -> analysis.R
(2) intervention cycling speed.R

Not sure whether you think this would look best as 2 rows in a single Workflow box, or 2 separate boxes - I don't mind.

I think it makes sense to continue to say that they both import functions as required. (Currently, intervention cycling speed.R imports the functions but doesn't actually use them - however I'm not planning to change this.)

(I had a go at changing it, but it didn't look good, so I thought I'd leave it to you if that's OK.)

carlhiggs · 2024-04-17T00:49:36Z

Hi Carl - looks great, but needs to change in one respectplease. intervention cycling speed.R doesn't depend on baseline, so should be in a row on its own. Can we have two separate rows: (1) baseline.R -> intervention destinations.R -> analysis.R (2) intervention cycling speed.R

Hi Steve, thanks for checking this out and for the clarification. Just wondering, is the output file ../output/edgesMelbourneUpdatedSpeed.gpkg likely to be used in analysis.R in the future, e.g. to evaluate its impact?

If it were, we could sketch this perhaps as,

flowchart TD
    subgraph Workflow
        direction LR 
        baseline.R --> a(intervention destinations.R)
        a --> analysis.R
        intervention.cycling.speed.R --> analysis.R
    end
    Workflow<-- imports as required ---c(functions/*.R)

Otherwise, we could do something like this,

flowchart TD
    subgraph WorkflowA[Destination allocation Intervention]
        direction LR 
        baseline.R --> a(intervention destinations.R)
        a --> analysis.R
    end
    subgraph WorkflowB[Cycling speed Intervention]
        direction LR 
        intervention.cycling.speed.R
    end
    WorkflowA<-- imports as required ---c(functions/*.R)
    WorkflowB -- imports as required ---c(functions/*.R)

what do you think?

carlhiggs · 2024-04-17T00:51:15Z

I suppose the other question would be, how important/useful do you think it would be to include data in this?

It could over-complicate things, but then again, it could show which files are generated from which steps, and that could be useful

Would love to hear your thoughts on this @StevePem

carlhiggs · 2024-04-17T01:22:36Z

Hi @StevePem , if we included only main code and data in the workflow diagram, and had a file like data-resources.yml that detailed input data in a standard way across projects (I'll try to provide a demo this week), we could mock this up in this kind of way:

flowchart TD
    subgraph WorkflowA[Destination allocation Intervention]
        direction LR 
        baseline.R --> a(intervention destinations.R)
        a --> analysis.R
    end
    subgraph WorkflowB[Cycling speed Intervention]
        direction LR 
        intervention.cycling.speed.R
    end
data-resources.yml --- WorkflowA 
data-resources.yml --- WorkflowB
WorkflowA --> o1[(output/intervention locations.sqlite)]
WorkflowA --> o2([output/intervention tables.xlsx])
WorkflowB --> o3([output/edgesMelbourneUpdatedSpeed.gpkg])

Maybe its not that important / useful to include all the individual data and sub-functions. Those are details that someone could check out either in data-resources.yml (or readme.MD!) and the code itself for functions.

What do you think about something high level like this to illustrate workflow and outputs?

StevePem · 2024-04-17T04:52:49Z

Hi @carlhiggs - thanks for these comments. To respond (and with a few other related thoughts):

It's not currently contemplated that the outputs of intervention cycling speed.R will feed into analysis.R. Rather: (1) intervention locations.sqlite (produced by intervention destinations.R) and the output of intervention cycling speed.R' will be taken up in the broader JIBE model, and (2) the outputs of analysis.R` are for the paper I'm writing with Belen and Tayebeh.
So in your first comment, it's the second diagram (after 'Otherwise, we could do something like this') that is correct.
I don't have a fixed view about whether to record details of the data in the diagram. Though I lean towards thinking that if we've already got a readme text description and/or a .yml file that lists out the data, it's superfluous to copy it out again in the diagram. I guess you'll want to adopt a consistent approach across the various repos.
I like the final diagram in your comments, showing a collective resource for data (rather than listing it all out) and the final outputs. I need to update Atefeh's spreadsheet to show which are 'intermediate' outputs and which are 'final'. I think you'd only want to include 'final' outputs in the diagram.
I'm currently thinking that the 'final' outputs from the destination allocation workflow will be intervention locations.sqlite, baseline assessment.xlsx, accessibility tables.xlsx and underutilisation tables.xlsx. So I think those are the ones that should be shown in the diagram. The others will all be intermediate. However, it's possible I could change my mind on some of them. Certainly some of the intermediate outputs are needed for visualisations that we are contemplating for various purposes.
The final output from the cycling speed workflow is currently contemplated as edgesMelbourneUpdated.gpkg, with 2 layers (residential speed reduced to 40 km/h and 30 km/h). I'll be creating it from the final version of the network produced by the UK team when that's done, and the final name and format will probably match whatever name/format the UK team use. I'll make a decision on this when the final network is ready.
To clarify that last point - I'm currently doing the accessibility and underutilisation analysis on the basis of the intermediate 'clipped network' (I would use the final network if it were ready, but it's not; and the final network won't differ from the clipped network in any way that is material to the analysis). Whereas the cycling speed intervention, which is very simple and will take about 5 minutes to run, needs to be based on the final network.

carlhiggs · 2024-04-17T07:09:54Z

Sounds good @StevePem, thanks for the advice and update. Perhaps we could hold off on this for now and update once outputs are finalised, and we agree if/how data could be listed in a configuration file. Let's discuss tomorrow.

Update README.md to include workflow diagram

9dfa456

Proposed inclusion of a mermaid workflow diagram in the readme markdown to illustrate the order in which files should be run. This could be expanded to add data outputs, potentially.

carlhiggs requested a review from StevePem April 16, 2024 03:58

StevePem added 2 commits April 17, 2024 10:08

updated diagram

a292b06

restoring previious diagram

9f46632

StevePem added 2 commits April 18, 2024 22:58

general update

86bae71

node distances results

1b6fc70

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update README.md to include workflow diagram #1

Update README.md to include workflow diagram #1

carlhiggs commented Apr 16, 2024 •

edited

Loading

StevePem commented Apr 17, 2024 •

edited

Loading

carlhiggs commented Apr 17, 2024

carlhiggs commented Apr 17, 2024 •

edited

Loading

carlhiggs commented Apr 17, 2024 •

edited

Loading

StevePem commented Apr 17, 2024 •

edited

Loading

carlhiggs commented Apr 17, 2024

Update README.md to include workflow diagram #1

Are you sure you want to change the base?

Update README.md to include workflow diagram #1

Conversation

carlhiggs commented Apr 16, 2024 • edited Loading

StevePem commented Apr 17, 2024 • edited Loading

carlhiggs commented Apr 17, 2024

carlhiggs commented Apr 17, 2024 • edited Loading

carlhiggs commented Apr 17, 2024 • edited Loading

StevePem commented Apr 17, 2024 • edited Loading

carlhiggs commented Apr 17, 2024

carlhiggs commented Apr 16, 2024 •

edited

Loading

StevePem commented Apr 17, 2024 •

edited

Loading

carlhiggs commented Apr 17, 2024 •

edited

Loading

carlhiggs commented Apr 17, 2024 •

edited

Loading

StevePem commented Apr 17, 2024 •

edited

Loading